Suffix Vector: Space- and Time-Efficient Alternative to Suffix Trees
Authors
Abstract
Suffix trees are versatile data structures used to solve many string-matching problems. One of the main arguments against their widespread use is their space requirement. This paper describes a new structure called the suffix vector, which not only requires less storage space but is also simpler than the most efficient suffix tree representation known to date. Alternative storage representations are discussed, and a linear-time construction algorithm is proposed. The space requirement of the suffix vector is compared with that of alternative suffix tree representations. We also make a theoretical comparison of the number of operations required to run algorithms on the suffix vector.
Similar resources
The Virtual Suffix Tree: An Efficient Data Structure for Suffix Trees and Suffix Arrays
We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large ...
An empirical evaluation of a metric index for approximate string matching
In this paper, we evaluate a metric index for the approximate string matching problem based on suffix trees, proposed by Gonzalo Navarro and Edgar Chávez [9]. Suffix trees are used during index construction to generate the intermediate data (a pivot table) to be indexed, and during query processing. One of the main problems with suffix trees is their space requirements. To address this, we propo...
Space Efficient Linear Time Construction of Suffix Arrays
We present a linear time algorithm to sort all the suffixes of a string over a large alphabet of integers. The sorted order of suffixes of a string is also called suffix array, a data structure introduced by Manber and Myers that has numerous applications in pattern matching, string processing, and computational biology. Though the suffix tree of a string can be constructed in linear time and t...
Space-efficient K-mer Algorithm for Generalised Suffix Tree
Suffix trees have proven very fast for pattern searching, yielding O(m) time, where m is the pattern size. Unfortunately, their high memory requirements make them impractical for huge amounts of data. We present a memory-efficient algorithm for a generalized suffix tree that reduces the space requirement by a factor of 10 when the size of the pattern is known beforehand. Experiments on th...
Linear Time Construction of Suffix Arrays
We present a linear time algorithm to sort all the suffixes of a string over a large alphabet of integers. The sorted order of suffixes of a string is also called suffix array, a data structure introduced by Manber and Myers that has numerous applications in computational biology. Though the suffix tree of a string can be constructed in linear time and the sorted order of suffixes derived from ...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2002